Abstract



Freeman, Robert R., and Pauline Atherton, Final Report of the Research
Project for the Evaluation of the UDC as the Indexing Language for a
Mechanized Reference Retrieval System, Report AIP/UDC-9 under NSF Grant
GN-433. New York, American Institute of Physics, May 1, 1968.

The background, objectives, and accomplishments of the project are
reviewed briefly. Specific areas discussed are English-language UDC
schedules, a computer-based UDC file management system, data bases for
retrieval experiments, batch-process and on-line, interactive information
retrieval systems, and retrieval system evaluation. The conclusions deal
with the usefulness of the UDC for mechanized retrieval systems, needed
research, needed organizational effort, and a proposed international
seminar. Several appendices summarize the current state of the UDC in
English and the availability of magnetic tape and microfilm files developed
by the project.



UDC 025.3+025.45UDC+651.83.012.1
Explanation of UDC numbers:

025.3      Cataloging and indexing - Information retrieval systems
025.45UDC  Decimal classifications - UDC
651.83     Indexing and retrieval methods
.012.1     Experimental testing and evaluation

Final Report of the Research Project for the Evaluation of the UDC
as the Indexing Language for a Mechanized Reference Retrieval System



Robert R. Freeman and Pauline Atherton 



1. Introduction. The development of modern data processing equipment and
techniques, beginning about two decades ago, coincided with a crisis in the
handling of scientific information by libraries and information centers.
Increasingly large and rapidly growing document collections and changing
user needs, brought on by the burgeoning of scientific research, placed
difficult stresses on existing science information systems.

The Universal Decimal Classification (UDC) had been developed and
promulgated since the beginning of the twentieth century, as a means of
classifying and indexing documents in any field of knowledge, but especially
in science and technology. While UDC gained wide acceptance, especially in
Europe, the questions of whether it could be used in newer, mechanized
information systems and whether it could be kept up-to-date in the face
of rapid change were naturally raised.

In the United States, the National Science Foundation supported pioneering
attempts to answer these questions, begun as early as 1961 by Malcolm Rigby
at the American Meteorological Society (AMS). These efforts led to the
demonstration of techniques for computer handling of UDC schedules and the
preparation of computer-printed systematic indexes based on UDC. An
experimental current-awareness service, Meteorological and Geoastrophysical
Titles, incorporated this type of index, for which the name UNIDEK was
coined. AMS has continued to produce UNIDEK indexes of the accessions of the
U. S. National Oceanographic Data Center, while the American Geological
Institute also adopted the techniques for annual indexes to Geoscience
Abstracts.

By 1965, it still remained to explore and demonstrate the use of UDC
in a mechanized retrieval system and to cope with the problem of managing
the UDC itself, which had never reached the stage of full publication in
the English language. The UDC Project of the American Institute of Physics
(AIP) was created by the National Science Foundation (NSF) to address these
problems.


2. Objectives. The objectives of the project were given in the proposal
submitted by AIP to NSF, as follows:

"We propose to design and demonstrate a reference
retrieval system in which the coding subsystem is the UDC
and the display and search subsystem is at least partially
mechanized. The objective is to evaluate the ability of
the UDC to present relevant references to the user in a
useful display and to screen out the irrelevant references.
The design of the retrieval tests using this mechanized UDC
system and other mechanized systems will be carefully
constructed in order to insure comparable results and
proper assessment of relevance by user group representatives.
A proposed standard description for evaluation tests will
be followed.

The experimental system proposed here is intended to
demonstrate the capabilities of the UDC as an indexing
language as it is being applied in real life situations.
As a part of the feedback from this study, we would expect
to learn about areas in which the UDC might be improved with
regard either to conceptual relations or to notational devices
when it is used in a mechanized system."

The plan of work for accomplishing these objectives is shown in Figure 1. 
The following section describes the results of these steps. 

3. Summary of Results. The results are described in terms of major areas
of accomplishment. They are (1) the UDC schedules, (2) the UDC File
Management system, (3) the data bases for retrieval system experimentation,
(4) the batch process retrieval system, (5) the interactive retrieval system,
and (6) retrieval system evaluation. Each of these areas is described in
detail in reports published by the project or in appendices of this report.



Consequently, only brief summaries with references are presented here.









FIGURE 1

RESEARCH PROJECT FOR THE EVALUATION OF THE UDC AS THE INDEXING LANGUAGE
IN A MECHANIZED REFERENCE RETRIEVAL SYSTEM

3.1. UDC Schedules. English language schedules from both published editions
and manuscript sources were entered into a single machine-readable file.
The entire UDC is represented to at least the level of the English Abridged
Edition, while approximately 95% is represented to at least the level of a
Medium Edition. Appendix 5 shows the UDC classes which are present in the
file. Appendix 5 gives details of the availability of the complete master
file on both magnetic tape and microfilm.

By special arrangement with the International Federation for Documentation
(FID), a significant beginning was made toward the development of an English
Medium Edition of UDC. The state of this effort and the arrangements for
availability of the manuscript are described in Appendix 4.

3.2. The UDC File Management System. Building on techniques developed by AMS,
we developed a computer-based file management system, which was used to
maintain the master file referred to above. The program also provides for
automatic generation of a keyword index to the schedules and for input to
a photocomposition system. With the help of a subcontractor, we were able
to demonstrate that high-quality composition of classification schedules
directly from a machine-readable file is possible. The file management
system and techniques, including photocomposition, are discussed in project
reports 2 and 3.
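
The automatic generation of a keyword index to the schedules can be
illustrated with a minimal sketch. It is not the project's IBM 1401 program:
the function name `keyword_index`, the stop-word list, and the two sample
schedule entries are assumptions chosen only to show the idea of deriving
alphabetical index entries from schedule headings.

```python
import re

# Hypothetical stop-word list; a real system would use a much longer one.
STOP_WORDS = {"and", "of", "the", "in", "for"}


def keyword_index(schedule):
    """Return (keyword, UDC number, heading) entries sorted alphabetically.

    `schedule` is a list of (udc_number, heading) pairs.
    """
    entries = []
    for number, heading in schedule:
        for word in re.findall(r"[A-Za-z]+", heading):
            key = word.lower()
            if key not in STOP_WORDS:
                entries.append((key, number, heading))
    return sorted(entries)


# Illustrative schedule entries, not taken from the project's master file.
SAMPLE_SCHEDULE = [
    ("025.3", "Cataloging and indexing"),
    ("651.83", "Indexing and retrieval methods"),
]

for keyword, number, heading in keyword_index(SAMPLE_SCHEDULE):
    print(f"{keyword:<12} {number:<8} {heading}")
```

Each significant word of a heading becomes an access point, so a heading
such as "Indexing and retrieval methods" is reachable under three keywords.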

3.3. Data bases for experimentation. Six files of UDC-indexed document
references were created and converted to magnetic tape for use in the
experimental retrieval systems described in sections 3.4 and 3.5. Tape
numbers, contents, and availability are described in Appendix 2. The files
consist of the following data.












3.3.1. Nuclear Science. The U.S. Atomic Energy Commission provided a tape
containing the descriptive cataloging and Euratom Keyword indexing data
for Nuclear Science Abstracts, vol. 19, June 15, 1965. The file
contained 2,330 references.

A team of experienced UDC indexers was assembled from among the staff
of several installations of the United Kingdom Atomic Energy Authority
(UKAEA), under the supervision of Mr. Jack Terry. This team provided
UDC-indexing records for each document reference in the collection. These
records were then merged into the file. Indexing was done according to the
Special Subject Edition of UDC for Nuclear Science and Technology, augmented
by a code of practice developed by UKAEA.



3.3.2. Geology. A file of 20,892 document references, consisting of abstract
numbers, titles, and UDC numbers, from Geoscience Abstracts, volumes 6-8
(1964-1966), was contributed in machine-readable form by the American
Geological Institute.

3.3.3. Meteorology. A file of approximately 9,000 document references,
consisting of authors, subject headings, and abstract numbers, was contributed
in machine-readable form by the American Meteorological Society. The data
were from the 1965 volume of Meteorological and Geoastrophysical Abstracts
(MGA). The AIP/UDC Project added UDC numbers from the printed MGA to
complete the file.



3.3.4. Oceanography. A file of 3,900 document references, consisting of
authors, titles, references, and UDC numbers, was contributed in
machine-readable form by the American Meteorological Society. The data
represent the 1966 and January-March, 1967 accessions of the National
Oceanographic Data Center.

3.3.5. Antarctic Studies. A file of 4,000 document references, consisting
of abstract numbers, titles, and UDC numbers, was created from the
Antarctic Bibliography, published by the Library of Congress under the
sponsorship of the Office of Antarctic Programs of the National Science
Foundation. The data represent abstracts originally published from 1962-1966.

3.3.6. Metallurgy File. A file of 9,159 document references, consisting
of abstract numbers, titles, and UDC numbers, was created from a card file
of abstracts contributed by the Iron and Steel Institute, located in London.
The data were originally published as the 1965 coverage of the Abstract and
Book Title Card Service (ABTICS) of that organization.

3.4. Batch Process Retrieval System. An existing package of computer
programs for the IBM 1401, known as the Combined File Search System, served
as the mechanism for testing and evaluation of the UDC, as described in
section 3.6. The details of the operation of this retrieval system were
presented in project report number 5.

3.5. Interactive Retrieval System. The feasibility of use of UDC in an
on-line interactive retrieval system was demonstrated with a system developed
by a subcontractor, using the nuclear science data base (see 3.3.1) and
the Special Subject Edition of UDC for Nuclear Science and Technology. The
results of the use of this system, referred to as "AUDACIOUS" (AUtomatic
Direct Access to Information with the On-line UDC System), are documented
in project report number 7.

3.6. Retrieval System Evaluation. Several experiments were undertaken for
the purposes of demonstrating the use of UDC in a computer-based retrieval
system and evaluating the degree of success which might be expected for
such applications. For the field of metallurgy, tests performed with the
cooperation of the Iron and Steel Institute (London) and the American
Society for Metals are reported in project report number 6. In the field
of nuclear science, tests performed in cooperation with the U. S. Atomic
Energy Commission, Euratom, and UKAEA are reported in project report
number 8.

4. Conclusions and Recommendations.

4.1. Usefulness of UDC. The purpose of the work of this project has been
to bring the light of practical experience to bear on an area in which there
had previously been only speculation, thus providing guidance for the
information system planner who is faced with deciding what role, if any,
the UDC should play in a particular system.

There is no longer any doubt that the UDC can be used as the indexing
language in a mechanized system. No barriers exist to the successful use
of the UDC in either a batch-processing or interactive mode.

The results of the project should lend support and encouragement to
those who will consider use of UDC in computer-based retrieval systems.
No insoluble problems were found, but the long-existent matter of the theory
according to which the UDC will be developed in the future is seen to be
accentuated by the requirements and capabilities of computer-based systems.

On the basis of experiments in a test environment which reasonably
simulates a real information system, we feel justified in encouraging those
who wish to make use of UDC as the indexing language in a computer-based
retrieval system. To the extent that the observed results are reliable,
valid, and indicative, the operating characteristics of the experimental
batch-process system are surprisingly good, especially when one recalls
that the indexing for all files described in section 3.3 was done with a
purely manual system in mind.

The results, particularly the failure analyses, revealed some points
which should be seriously considered by system designers and managers who
intend to use UDC as the indexing language in their system. These points,
which may be grouped as (1) search strategies and predictive tools, (2)
hierarchical searching, (3) new indexing policies, and (4) revisions and
innovations in the UDC, are discussed in project report number 6.
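
Hierarchical searching, one of the points listed, can be sketched as prefix
matching on the decimal notation. This is only an illustration under stated
assumptions, not the logic of the Combined File Search System: it handles
simple main numbers only (no auxiliaries or "+" combinations), and the
function names and sample records are hypothetical.

```python
def normalize(udc_number):
    """Drop the punctuation dots, which only group digits for readability."""
    return udc_number.replace(".", "")


def is_within(udc_class, udc_number):
    """True if udc_number is udc_class itself or one of its subdivisions."""
    return normalize(udc_number).startswith(normalize(udc_class))


def hierarchical_search(udc_class, records):
    """Return ids of records carrying any number inside udc_class.

    `records` maps a record id to the list of UDC numbers assigned to it.
    """
    return [
        record_id
        for record_id, numbers in records.items()
        if any(is_within(udc_class, n) for n in numbers)
    ]


# Illustrative records; the ids and assigned numbers are invented.
SAMPLE_RECORDS = {
    "A1": ["669.1"],
    "A2": ["669.14"],
    "A3": ["551.5"],
    "A4": ["66"],
}

print(hierarchical_search("669", SAMPLE_RECORDS))
print(hierarchical_search("66", SAMPLE_RECORDS))
```

A query on a broad class retrieves everything indexed at or below it, which
is the property that makes the hierarchical notation useful for widening or
narrowing a search.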

AUDACIOUS was, to the best of our knowledge, the first on-line
interactive retrieval system in which one of the widely used traditional
classification and indexing tools was used. While the UDC was the tool in
this case, the success of the experiment may be generalizable to other tools,
such as the Dewey and Library of Congress Classifications.

For system designers, clearly, the most important implication of the
results of AUDACIOUS is the need for careful consideration of the user
viewpoint in all facets of the design of an interactive retrieval system.
A system which is a technical success can fail to impress an information
system user in many areas, some of which we have discussed in project report
number 7.

In general, system planners will want to consider the UDC if there are
compelling reasons. Several such reasons might be mentioned here. (1) An
organization, through many years of use, may have built up large files and
a skilled staff based on the use of UDC. (2) The ability to use UDC could
save the not-insignificant cost of developing an indexing language. (3) The
idea of an internationally used indexing language may have appeal for
organizations for whom international exchange of materials in
several natural languages is important. In conditions such as these,
no barrier exists to the successful use of UDC in a mechanized retrieval
system.

Improvements during the past few years in communications and computer
technologies strongly indicate that networks of libraries and information
centers, whose resources are linked electronically, will be feasible in the
not-too-distant future. Users will be able to conduct searches by means
of a dialogue with the system, with access to distant as well as
geographically-nearby files of information. Such networks need not be
confined within national borders; Dubon [1], for example, has outlined a
possible European Information Network, in which various national centers,
each specializing in a given subject area, would exchange information
cooperatively.

The very concept of an international network raises the question:
what manner of indexing would serve adequately for users who do not share
a common natural language? One solution is to use the language in which the
largest volume of literature is written, i.e. English. This solution
undoubtedly serves well in a situation in which the user must submit his
question through an intermediary analyst who is skilled in both the subject
matter and in English. However, it is open to question whether the average
non-native speaker, even though he may be able to converse with another
person in English, would be able to carry on a successful dialogue with a
computer-based information file.

Another solution might be to make use of a form of indexing that is
not dependent on natural language - which suggests the UDC. Without
commenting on the present adequacy of the UDC, it should be emphasized
again that tables for conversion between UDC and natural language already
exist for some sixteen languages [2] and that UDC-indexing appears with
original research papers in possibly hundreds of journals published in many
countries and languages. Kepple [3] has pointed out the advantage of the UDC
as a tool for an international library because it is not language-dependent.

1. R.J. Dubon, "Implementation of an International Information Retrieval
Center", pp. 339-346 in Progress in Information Science and Technology.
Proceedings of the American Documentation Institute, Vol. 3, Santa Monica,
California, Adrianne Press, 1966.

A third solution, requiring greater effort to implement it, would be
to permit indexing and searching to be done using a controlled
natural-language vocabulary of local choice. A part of the system would then
be a table of equivalences between the UDC and the natural language
vocabulary. The result would be to take advantage of the hierarchical
notation of the UDC without even requiring that the user be familiar with
the UDC. In addition, since the UDC would be the internal form of indexing,
users in any center could direct queries to the file, without regard to the
original language in which the indexing was done.
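
The third solution can be sketched as follows. This is a hedged
illustration, not a description of any system built by the project: the
table contents, the language codes, and the function names are assumptions,
and a real table of equivalences would be far larger and would need to
handle terms with several UDC equivalents.

```python
# Hypothetical table of equivalences: local vocabulary term -> UDC number.
EQUIVALENCES = {
    "en": {"meteorology": "551.5", "metallurgy": "669"},
    "fr": {"météorologie": "551.5", "métallurgie": "669"},
    "de": {"meteorologie": "551.5", "metallurgie": "669"},
}


def to_udc(term, language):
    """Translate a controlled-vocabulary term into its UDC number."""
    return EQUIVALENCES[language][term.lower()]


def index_document(index_file, doc_id, term, language):
    """Store the document under the UDC number, the internal index form."""
    index_file.setdefault(to_udc(term, language), []).append(doc_id)


def search(index_file, term, language):
    """Search with a local-language term; matching is done on UDC numbers."""
    return index_file.get(to_udc(term, language), [])


index_file = {}
index_document(index_file, "doc-1", "Meteorology", "en")  # indexed in English
index_document(index_file, "doc-2", "Metallurgie", "de")  # indexed in German

print(search(index_file, "météorologie", "fr"))  # query posed in French
print(search(index_file, "metallurgy", "en"))    # query posed in English
```

Because only UDC numbers are stored in the file, neither the indexer nor
the searcher needs to know which language the other used.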

4.2. Needed Research. A critical need continues to be methods for
evaluation of information retrieval systems, especially those which may be 
employed during the design of a system in order to maximize the chances of 
its success according to some criteria. At the time this project commenced 
work, quantitative performance measures applied in a post hoc fashion 
reflected the state-of-the-art. 

Since that time, there has been progress in three distinct areas. First,
the techniques of statistical sampling and inference are coming to be applied
to information systems so as to enable predictions of performance to be made
before full-scale development of a system. Second, there has been the
development of evaluation models which take into account, at least
qualitatively, the behavioral factors involved in the act of judging the
relevance of documents retrieved by a system. Finally, there is increasing
realization that behavioral factors of man-system relationships may be more
significant than quantitative performance in the evaluation of information
systems.

2. For a list of UDC editions, see International Federation for Documentation,
FID Publications Catalogue 1968. FID 42?, The Hague, January, 1968.

3. R.R. Kepple, "Serving Readers in a Special International Library",
College and Research Libraries, 28 (3), 205-207, 216 (1967).

We have attempted to make use of these advances, if only crudely at 
times, in compiling the results of this project's work. However, feeling 
keenly the lack of sufficient and integrated methods, we strongly urge 
that further work along this line be carried out. Methods for the design 
and evaluation of systems in which the user interacts directly with a 
machine-stored index or document file are especially needed. 

We have presented a detailed discussion of the difficulties which arise
at the time UDC-indexed files are searched by machine which are attributable
to the structure and class definition system of the UDC. The problems of
revisions and innovations reflect a deep-rooted question for the International
Federation for Documentation: can the UDC be universal in the sense of being
applicable to all types of information systems? Are the requirements of
organizations which will use the UDC for the purpose of systematic
single-entry document file organization (e.g. conventional libraries)
compatible with those of organizations which will offer services based on
deep indexing, highly specific questions, and the use of the computer as an
aid?

From the point of view of the latter type of system, continued research
into the revision of UDC according to principles and techniques of faceted
classifications seems to be indicated. We also recommend the testing of
more sophisticated devices for coding syntagmatic relationships, such as
the schema of relators suggested by Perreault.

Another suggestion we would make for future investigation is to explore
the use of UDC in conjunction (rather than in parallel, as we did) with a
suitably detailed thesaurus. UDC might be used to rapidly narrow the
portion of the file to be searched to a small size, the thesaurus then
being used for detailed interaction with that subset of the file. The
problem of user preferences for natural language versus a numeric or other
code also needs to be investigated.
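
The suggested conjunction of UDC and a thesaurus can be sketched as a
two-stage search. This is an illustration of the idea only, since the
project did not build such a system; the record layout, the sample records,
and the function names are assumptions, and UDC class membership is
approximated by simple prefix matching on main numbers.

```python
def in_class(udc_number, udc_class):
    """True if udc_number equals udc_class or is one of its subdivisions."""
    return udc_number.replace(".", "").startswith(udc_class.replace(".", ""))


def two_stage_search(records, udc_class, thesaurus_terms):
    """Narrow the file by UDC class, then filter the subset by keywords."""
    # Stage 1: UDC reduces the file to a small subset.
    subset = [
        r for r in records if any(in_class(n, udc_class) for n in r["udc"])
    ]
    # Stage 2: thesaurus terms are matched only against that subset.
    wanted = {t.lower() for t in thesaurus_terms}
    return [
        r["id"] for r in subset if wanted & {k.lower() for k in r["keywords"]}
    ]


# Illustrative records; ids, numbers, and keywords are invented.
SAMPLE_FILE = [
    {"id": "N-1", "udc": ["539.1"], "keywords": ["neutrons", "scattering"]},
    {"id": "N-2", "udc": ["539.17"], "keywords": ["fission"]},
    {"id": "M-1", "udc": ["669.1"], "keywords": ["scattering"]},
]

print(two_stage_search(SAMPLE_FILE, "539.1", ["scattering"]))
```

Record M-1 carries the keyword but lies outside the chosen class, so it is
never examined in the second stage; that is the saving the paragraph above
has in mind.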

Finally, our experience with AUDACIOUS (project report number 7) and
several other similar systems points to a need for research on methods of
teaching the use of on-line, interactive retrieval systems. The similarities
of many aspects of this type of retrieval system with computer-aided
instruction (CAI) suggest that the latter may provide a fruitful avenue
of exploration toward a solution of this problem.

4.3. Needed Organizational Effort. Probably the most frequently-heard
criticism of UDC is that it is not up-to-date in its coverage of the
evolving terminology of various technical subject areas. The problem
has two aspects. One is the international voluntary committee system by
which the need for change is communicated to the International Federation for
Documentation (FID) and its Central Classification Committee. The other is
the technical difficulty of maintaining and disseminating up-to-date
classification schedules in many languages. The latter reflects back on
the former in that it is often difficult even for a revision committee to
obtain the most recent and complete version of the schedules for its
specialty. In project report number 3 we summarized the problem in
quantitative terms, as follows:

"Quantitatively speaking, if we assume that a
reasonable goal for the UDC is to be available in a
complete form in a total of twenty languages, the UDC
would be a file of approximately the following size:

        125,000   records in a full edition
    x        20   languages
      2,500,000   records
    x       200   characters per record, including
                  UDC number, heading, cross references,
                  and alphabetic index entries
    500,000,000   characters in total UDC file

The problem of the UDC would be to keep such a
file up to date, disseminate changes to users rapidly,
and select various portions of the file to be printed
periodically, according to managerial decisions as to
needs for new full editions, abridgements, and special
subject editions. A further complication of no small
magnitude is that the encoding and display mechanisms
must provide for all of the orthographic forms commonly
used by the twenty (or more) languages, as well as
mathematical symbols."
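
The arithmetic in the quoted estimate can be checked with a few lines. The
three input figures are those given in the quotation; the function and
variable names are ours, introduced only for this sketch.

```python
# Figures taken from the quoted estimate.
RECORDS_PER_FULL_EDITION = 125_000
LANGUAGES = 20
CHARACTERS_PER_RECORD = 200


def udc_file_size(records_per_edition, languages, characters_per_record):
    """Return (total records, total characters) for a multilingual file."""
    total_records = records_per_edition * languages
    total_characters = total_records * characters_per_record
    return total_records, total_characters


total_records, total_characters = udc_file_size(
    RECORDS_PER_FULL_EDITION, LANGUAGES, CHARACTERS_PER_RECORD
)
print(f"{total_records:,} records")
print(f"{total_characters:,} characters")
```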



The technical aspect of the problem can be solved, as demonstrated by
the work reported in project reports 2 and 3 and by the work of the staff
of the Zentralstelle für maschinelle Dokumentation in Germany. An effort
is needed on the part of FID or its national members to develop a permanent
base and an expert staff to proceed toward the solution.



4.4. Seminar on UDC in a Mechanized Retrieval System. One of the most
concrete, yet difficult to document, results of this project is an operational
information retrieval system capable of using UDC as the indexing language
(the Combined File Search System). Anticipating requests from potential
users for instructions on how to operate the system, we suggested a
one-week seminar at which the entire process could be reviewed in
step-by-step detail, with demonstrations on a computer. The suggestion was
accepted by the FID Classification Research (FID/CR) and Central
Classification (FID/CCC) Committees. The Danish Centre for Documentation,
located at the Technical Library of Denmark, has agreed to act as a
secretariat and to provide meeting facilities. The North European University
Computer Center likewise agreed to provide computer time. The seminar is
scheduled tentatively for September 2-6, 1968.

5. Acknowledgements. The AIP/UDC Project benefitted from the splendid
cooperation of many individuals and organizations, spread over the United
States, Canada, and many parts of Europe. While their names are too
numerous to list individually, their services as consultants, advisors,
subcontractors, and friends of the project are nonetheless sincerely

APPENDIX 1

Bibliography of AIP/UDC Project Reports, with Abstracts



1. Freeman, Robert R., Research Project for the Evaluation of the UDC as
the Indexing Language for a Mechanized Reference Retrieval System: An
Introduction, New York, American Institute of Physics, Report AIP/DRP
UDC-1, October 1, 1965. NSF Grant GN-433.

The report describes the five areas of activity which lead toward the aim
expressed in the title: (1) to develop a complete English-language version
of UDC in both hierarchical and alphabetical arrangement in machine-readable
form; (2) to develop techniques for automatic file maintenance and
photocomposition of UDC editions; (3) to develop a computer-based reference
retrieval system which uses UDC as its indexing language; (4) to collect a
set of UDC-indexed document files in machine-readable form in various
subject areas; and (5) to conduct tests with the aid of experimental user
groups which will lead to an evaluation of the UDC in the desired context.
Data are also given on the organization of the project.

2. Freeman, Robert R., Research Project for the Evaluation of the UDC as
the Indexing Language for a Mechanized Reference Retrieval System: Progress
Report for the Period July 1, 1965 - January 21, 1966, New York, American
Institute of Physics, Report AIP/DRP UDC-2, February 1, 1966. NSF Grant
GN-433.

The report reviews activities involving collection of English, French, and
German schedules of the Universal Decimal Classification (UDC), translation
of some schedules, further development of a mechanized (IBM 1401) UDC file
maintenance system, experiments with automatic alphabetic indexing of UDC
schedules, automatic typesetting and composition of UDC schedules, selection
of equipment and rules for keyboarding the UDC into machine-readable form,
and initial steps toward collections of UDC-indexed documents and a retrieval
system for test and evaluation purposes. Detailed appendices deal with
considerations of creating machine-readable UDC records on punched-paper tape
for subsequent computer processing.

3. Freeman, Robert R., Modern Approaches to the Management of a Classification,
Report AIP/UDC-3 under National Science Foundation Grant GN-433, New York,
American Institute of Physics, October 1, 1966. Presented at the Seminar
on UDC and Mechanization at the 32nd Conference of the International
Federation for Documentation, The Hague, September 20, 1966. Also
published as "The Management of a Classification: Modern Approaches
Exemplified by the UDC Project of the American Institute of Physics,"
Journal of Documentation, 23(4), 304-320 (December, 1967).

The report views the problem of managing a classification, such as the
Universal Decimal Classification (UDC), as an example of the broader class of
problems known in the system analysis and data processing field as "file
management". The characteristics of file management are listed and related
specifically to the UDC. The uses of data processing equipment for the
creation, maintenance, manipulation and display of files are discussed. The
development of a prototype file management system for the UDC is reviewed.
Appendices illustrate the progress of the project and summarize the present
status of the UDC in the English language.










4. Russell, Martin, and Freeman, Robert R., Computer-Aided Indexing of a
Scientific Abstracts Journal by the UDC with UNIDEK: a Case Study,
Report AIP/UDC-4 under National Science Foundation Grant GN-433, New
York, American Institute of Physics, April 1, 1967.

This paper is a case study of the adoption by Geoscience Abstracts of UNIDEK,
a novel computer-compiled systematic subject index based on the Universal
Decimal Classification (UDC) of the International Federation for Documentation
(FID). Events leading to a decision to adopt the system, some theory of
indexes, problems involved in conversion, and some of the results achieved
are reviewed.

5. Freeman, Robert R. and Pauline Atherton, File Organization and Search
Strategy Using the Universal Decimal Classification in Mechanized
Reference Retrieval Systems, Report AIP/UDC-5 under National Science
Foundation Grant GN-433, New York, American Institute of Physics,
September 15, 1967. Presented at the FID/IFIP Conference on Mechanized
Information Storage, Retrieval, and Dissemination, Rome, June 15, 1967.
Published in Proceedings of the Conference, North Holland Publishing
Co., (forthcoming).

Starting from a model of contemporary mechanized retrieval systems and the
characteristics of indexing languages used therein, the authors develop a
rational basis for use of the Universal Decimal Classification (UDC) in this
context. Practical design considerations for the use of UDC in a mechanized
retrieval system are discussed. Examples are reported of the use of UDC as
the indexing language with the Combined File Search System, an existing
retrieval system for the IBM 1401, used by several large information centers
in the United States. Finally, the authors discuss how UDC might be used as
a query language in a typical retrieval system of the near future in which
the user interacts directly with the computer-stored document reference file.

The authors conclude that it is technically feasible to use UDC in mechanized
retrieval systems and that, under certain conditions, it may be desirable.
Some of these conditions are the existence of large files already indexed by
UDC, staff already trained for its use, and extensive international use or
exchange of materials of the system.

6. Freeman, Robert R., Evaluation of the Retrieval of Metallurgical Document
References Using the Universal Decimal Classification in a Computer-
Based System, Report AIP/UDC-6 under National Science Foundation Grant
GN-433, New York, American Institute of Physics, April 1, 1968.

A set of twenty-five questions were processed against a computer-stored
file of 9,159 document references in the field of ferrous metallurgy,
representing the 1965 coverage of the Iron and Steel Institute (London)
information service. A basis for evaluation of system performance
characteristics and analysis of system failures was provided by using
questions which had previously been processed by the American Society for
Metals against a data base which contained many of the same documents. The
Cuadra-Katter model for describing the system evaluation environment was
used. The results, which were highly satisfactory, led to observations and
recommendations which contrast the requirements for class definition,
indexing policy, and search strategy between manual and computer-based
systems which use UDC.

7. Freeman, Robert R., and Pauline Atherton, AUDACIOUS - an Experiment with 
an On-Line, Interactive Reference Retrieval System Using the Universal 
Decimal Classification as the Index Language in the Field of Nuclear 
Science, Report AIP/UDC-7 under National Science Foundation Grant GN-433, 
New York, American Institute of Physics, April 25, 1968. 

The report describes an experimental system for remote direct access to 
files of computer-stored information which has been indexed by the Universal 
Decimal Classification (UDC). The data base for the experiment consisted of 
references from a single issue of Nuclear Science Abstracts. The Special 
Subject Edition of UDC for Nuclear Science and Technology was also stored in 
the computer so that users could discover how to translate their questions 
from natural language to logical statements containing UDC numbers. 

The authors conclude that the technical feasibility of use of existing 
classification and indexing tools, such as UDC, has been demonstrated. 

However, detailed attention to all facets of man-machine communication is 
a necessity if systems are to be designed which will be voluntarily used. 
AUDACIOUS is reviewed and criticized from this point of view. 

Finally, the authors conclude that the use of UDC in an on-line, interactive 
system may have important ramifications for the development of international 
information networks. Conversion tables (schedules) already exist which would 
allow speakers of many languages to search files indexed by UDC without regard 
to national or linguistic boundaries. 

8. Atherton, Pauline, Donald W. King, and Robert R. Freeman, Evaluation of 
the Retrieval of Nuclear Science Document References Using the Universal 
Decimal Classification in a Computer-Based System, Report AIP/UDC-8 
under National Science Foundation Grant GN-433, New York, American 
Institute of Physics, May 1, 1968. 

A single issue of Nuclear Science Abstracts, containing about 2,500 abstracts, 
was indexed by UDC, using the Special Subject Edition of UDC for Nuclear 
Science and Technology. The descriptive cataloging and UDC-indexing records 
formed a computer-stored data base. A systematic random sample of 500 
additional abstracts, taken from a collection of about 196,000, was also 
indexed by UDC. An experimental design was developed such that the potential 
results of retrieval tests with the full collection could be inferred from 
actual results obtained from the two smaller data bases. 
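
The abstract does not describe how the systematic random sample was drawn. As a minimal sketch only, assuming the common procedure of a fixed sampling interval with a random starting point, the selection of 500 abstracts from a collection of about 196,000 might look like the following. The function name and the use of 0-based positions are illustrative assumptions, not details taken from the report.

```python
import random


def systematic_sample(population_size, sample_size, rng=None):
    """Return 0-based positions chosen by systematic random sampling.

    Assumed procedure (not taken from the report): pick a random start
    within the first interval, then take every interval-th item.
    """
    if not 0 < sample_size <= population_size:
        raise ValueError("sample_size must be between 1 and population_size")
    rng = rng or random.Random()
    interval = population_size // sample_size  # spacing between chosen items
    start = rng.randrange(interval)            # random start in first interval
    return [start + i * interval for i in range(sample_size)]


# Figures from the abstract: 500 abstracts out of about 196,000.
positions = systematic_sample(196_000, 500, random.Random(0))
```

With these figures the interval is 392, so every selected position lies inside the collection whatever starting point is drawn.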

Sixty questions were collected from nuclear science research organizations 
in North America and Europe. Two search analysts, neither of whom was 
familiar with the policies and practices of the indexers, formulated logical 
search statements with UDC numbers. The resulting queries were processed 
against the UDC-indexed data bases. They were also processed by two other 
information services. Twelve questions, a subset of the original sixty, 
were chosen for more detailed analysis. The results are presented in the 
report. 
9. Freeman, Robert R., and Pauline Atherton, Final Report of the Research 
Project for the Evaluation of the UDC as the Indexing Language for a 
Mechanized Reference Retrieval System, Report AIP/UDC-9 under NSF Grant 
GN-433, New York, American Institute of Physics, May 1, 1968. 

The background, objectives, and accomplishments of the project are 
reviewed briefly. Specific areas discussed are English language UDC schedules, 
a computer-based UDC file management system, data bases for retrieval 
experiments, batch-process and on-line, interactive information retrieval 
systems, and retrieval system evaluation. The conclusions deal with the 
usefulness of the UDC for mechanized retrieval systems, needed research, 
needed organizational effort, and a proposed international seminar. Several 
appendices summarize the current state of the UDC in English and the 
availability of magnetic tape and microfilm files developed by the project. 
APPENDIX 2 

AIP/UDC Project Magnetic Tapes and Other Tapes Containing AIP/UDC Project 
Materials 



A. Inquiries concerning the following tapes should be directed to the 
Information Division, American Institute of Physics, 335 East 45th Street, 
New York, New York 10017. 



Tape     Tracks   Density   Contents 

AIP-1    9        800       Geology document file, reel 1 of 2 
AIP-2    9        800       Geology document file, reel 2 of 2 
AIP-3    9        800       UDC English Language Master File (see Appendix 5) 
AIP-4    7        800       UDC English Language Master File 
AIP-5    9        800       Geology descriptor file 
AIP-6    9        800       Metallurgy descriptor file 
AIP-7    7        556       UDC Special Subject Edition for Metallurgy 
AIP-8    7        556       Combined File Search System Program Tape 
AIP-9    9        800       Metallurgy document file 
AIP-10   7        556       UDC numbers from German Medium Edition 
                            (numbers only, no text) 
AIP-11   -        -         - 
AIP-12   9        800       UDC Reverse Cross reference file (see Appendix 5) 
AIP-13   7        800       UDC Abridged Building Classification - 1963 
AIP-14   9        800       Nuclear Science document file 
AIP-15   9        800       Nuclear Science descriptor file 
AIP-16   7        800       Nuclear Science document file 
AIP-17   7        800       Nuclear Science descriptor file 
AIP-18   9        800       Combined File Search System - System Tape 

B. Inquiries concerning the following tapes should be directed to the 
American Meteorological Society, P.O. Box 1756, Washington, D.C. 20015. 



Tape     Tracks   Density   Contents 

AMS-1    9        800       Meteorology document file 
AMS-2    9        800       Meteorology descriptor file 
AMS-3    9        800       Oceanography document file 
AMS-4    9        800       Oceanography descriptor file 
AMS-5    9        800       Antarctic document file 
AMS-6    9        800       Antarctic descriptor file 












SUMMARY OF THE STATUS OF THE UNIVERSAL DECIMAL CLASSIFICATION IN ENGLISH, 
INCLUDING PROGRESS BY THE AIP/UDC PROJECT THROUGH 31 December 1967 



KEY TO CODES USED IN TABLE 

1  = Published by British Standards Institution [BSI] (except for Special 
     Subject Editions) 
2  = Unpublished manuscript 
2a = Manuscript reported available at BSI, but not in possession of AIP/UDC 
     Project Staff 
3  = Section entirely covered in published Special Subject Edition 
4  = English Medium Edition text derived by condensation and editing of 
     published full edition or manuscript through comparison with German 
     Medium Edition* and Extensions and Corrections (E&C) 
4a = Unofficial translation completed by AIP/UDC Project Staff 
5  = Manuscript not yet completed 

All check-marked entries in the following table correspond to records 
included in the AIP/UDC Project English Language Master File (Appendix 5). 
All entries bearing the code number 4 in the following table correspond to 
English Medium Edition manuscript included in Appendix 4. 



*Deutscher Normenausschuss, Dezimalklassifikation: DK-Handausgabe, Band 1, 
Systematische Tafeln. Beuth-Vertrieb GmbH, Berlin and Köln, 1967. 
UDC SECTION 

71 Physical Planning 
72 Architecture 
77 Photography 
78 Music 
79/795 Entertainment, Theatre, Games 
796/799 Sports 
8/81 Language and Linguistics 
82/89 Literature 
9 Geog., History, Biography 

AUXILIARIES 

= Language 
(0..) Form of Work 
(1/9) Place 
(=...) Race, Nationality 
Time 
.00 Point of View 

SPECIAL SUBJECT EDITIONS 

Nuclear Science (NS) 
Metallurgy (ME) 
Education (ED) 
Building (ABC) 
Standardization 
Polar Regions 
Sport and Physical Education (SP) 

APPENDIX 4 



UDC ENGLISH MEDIUM EDITION MANUSCRIPT. CLASSES 0, 1, 2, 3, 5, 7, 8 and 9 

Introduction. This appendix includes the full manuscript for the main 
classes of an English language medium edition of UDC with the exception 
of class 6, the largest class. The manuscript was prepared under the 
supervision of Mr. Geoffrey Lloyd at the headquarters of the International 
Federation for Documentation (FID) in the Hague, with the support of 
funds transferred to FID by the AIP/UDC Project with permission of the 
National Science Foundation. 

The raw material for the manuscript was (1) an earlier version of the 
AIP/UDC Project English Language Master File which is included in this report 
as Appendix 5, (2) recent Extensions and Corrections to the UDC, and (3) the 
German and French Medium Editions of UDC. Of the total manuscript reproduced 
here, all but class 5 was subsequently entered into the English Language 
Master File. 

By arrangement with FID, the British Standards Institution has 
tentatively agreed to publish the English Medium Edition. Actual publication 
will be contingent upon completion of UDC class 6, the auxiliary classes, 
and an alphabetic index to the edition. In the interests of making the 
manuscript available for the use of researchers and teachers of library 
science, the incomplete manuscript, amounting to 687 typewritten pages, 
has been reproduced as described below. 

Availability. Owing to the large number of pages in the manuscript, 
only the introductory pages of this appendix are reproduced here. The 
full manuscript will be made available in microform according to the 
arrangements detailed in Appendix 5. 
APPENDIX 5 



THE UDC ENGLISH LANGUAGE MASTER FILES 



PART 1: THE CLASSIFICATION SCHEDULE 



PART 2: REVERSE CROSS REFERENCES 



Introduction. This appendix includes the complete, merged set of UDC 
schedules accumulated by the AIP/UDC Project from 1965-1967. The file, 
which contains 110,759 records, is stored on magnetic tape recorded at 
a density of 800 cpi in IBM BCD code. Both 7 and 9-track versions 
exist. The reverse cross reference file, which contains 24,140 records, 
is also stored on magnetic tape. 



Each record in the schedule includes the UDC number, a code which 
represents the source of the record, and the English language equivalent 
of the UDC number. Many records also contain cross references 
and scope notes which serve to further define or delimit a concept. 



Cross references were identified and recorded in a separate file, 
as well as in the UDC schedule. The former was sorted and printed in 
the order of the UDC numbers referred to by a given record. The resulting 
listing appears here as Part 2, the reverse cross reference file. The 
source codes which appear in both parts of this appendix are explained 
below. 
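
The record contents and the reverse cross reference listing described above can be illustrated with a short sketch. It is not the project's file layout: the field names, the sample records, and their cross references are invented for illustration, and plain string ordering stands in for true UDC filing order.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ScheduleRecord:
    """One schedule record; the field names are illustrative assumptions."""
    udc_number: str
    source_code: str
    text: str
    cross_references: List[str] = field(default_factory=list)
    scope_note: str = ""


def build_reverse_cross_references(records) -> Dict[str, List[str]]:
    """Map each UDC number referred to onto the records that refer to it."""
    reverse = defaultdict(list)
    for record in records:
        for target in record.cross_references:
            reverse[target].append(record.udc_number)
    # Plain string sort; real UDC filing order is more involved.
    return {target: sorted(reverse[target]) for target in sorted(reverse)}


# Invented sample data, for illustration only.
records = [
    ScheduleRecord("539.1", "ENNS64", "Nuclear physics", ["621.039"]),
    ScheduleRecord("621.039", "ENNS64", "Nuclear engineering", ["539.1", "53"]),
    ScheduleRecord("53", "ENAB61", "Physics"),
    ScheduleRecord("669", "ENNT64", "Metallurgy", ["53"]),
]
reverse = build_reverse_cross_references(records)
```

Each key of `reverse` is a UDC number that some record refers to, and each value lists the referring records, which mirrors the arrangement described for Part 2.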
Source Edition Codes Used in the Classification Schedule and 
Reverse Cross Reference File 

Code      Meaning 

ENAB61    Abridged Edition, 1961, complete 
ENED65    Education Edition, 1965 
ENFU--    Full Edition, unpublished ms. 
ENFU43    Full Edition, 1943 (partial) 
ENFU54    Full Edition, 1954 (partial) 
ENFU55    Full Edition, 1955 (partial) 
ENFU58    Full Edition, 1958 (partial) 
ENFU64    Full Edition, 1964 (partial) 
ENNE67    Medium Edition, 1967 (not yet published) 
ENNT64    Metallurgy Edition, 1964 
ENNS64    Nuclear Science Edition, 1964 
ENSP64    Sport and Physical Education Edition, 1964 

(EN indicates an English language schedule. Two letters serve as 
an edition code and two digits identify the year of publication.) 
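
The code layout described in the note (two letters for the language, two for the edition, two digits for the year) lends itself to a simple parser. The following is a minimal sketch under stated assumptions: the names `SourceCode` and `parse_source_code` are hypothetical, years are assumed to fall in the 1900s, and a dash placeholder in the year position (as in the unpublished Full Edition entry) is treated as "no year".

```python
import re
from typing import NamedTuple, Optional

# Two language letters, two edition letters, then two digits or dashes.
_CODE = re.compile(
    r"^(?P<lang>[A-Z]{2})(?P<edition>[A-Z]{2})(?P<year>\d{2}|[-\u2014]+)$"
)


class SourceCode(NamedTuple):
    language: str
    edition: str
    year: Optional[int]  # None when the code carries no year


def parse_source_code(code: str) -> SourceCode:
    """Split a source edition code into language, edition, and year."""
    match = _CODE.match(code.strip())
    if match is None:
        raise ValueError(f"not a source edition code: {code!r}")
    year = match.group("year")
    # Assumption: all editions listed were produced in the 1900s.
    full_year = 1900 + int(year) if year.isdigit() else None
    return SourceCode(match.group("lang"), match.group("edition"), full_year)


abridged = parse_source_code("ENAB61")
unpublished = parse_source_code("ENFU--")
```

Here `abridged` carries the language `EN`, the edition `AB`, and the year 1961, while `unpublished` has no year.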
introductory section of the appendix is reproduced in the final report. 
The following arrangements have been made in order to assure availability 
of the full appendix to persons interested in its use for research 
and teaching purposes. 

1. The appendix has been recorded on 16 mm negative reel microfilm. 
Arrangements for obtaining copies may be made by contacting the 
Information Division, American Institute of Physics, 335 East 45th Street, 
New York, New York 10017. 



2. The final report including this appendix will be available on 
microfiche through the Educational Resources Information Center (ERIC) 
of the U.S. Office of Education. The final report will be processed 
by the ERIC Clearinghouse on Library and Information Sciences, 
2122 Riverside Avenue, Minneapolis, Minnesota 55404. After 
announcement in the ERIC abstracting service, Research in Education, 
the appendix will be available through the ERIC Document Reproduction 
Service, the National Cash Register Co., 4936 Fairmont Avenue, 
Bethesda, Maryland 20014. 

3. Since the appendix will be available through the ERIC 
system, it will not be made available through the Clearinghouse 
for Federal Scientific and Technical Information. 